Conversation

@ystaticy ystaticy commented Feb 5, 2026

What problem does this PR solve?

Issue Number: Close #10219

What is changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Code changes

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

Release note

None.

Summary by CodeRabbit

  • New Features

    • Added a gauge to track keyspace counts per keyspace group for improved observability.
  • Tests

    • Added test coverage validating keyspace count metric behavior and cleanup.
  • Chores

    • Updated a dependency declaration and refactored internal logic to improve reliability and maintainability.

Signed-off-by: ystaticy <y_static_y@sina.com>
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has signed the dco. labels Feb 5, 2026
ti-chi-bot bot commented Feb 5, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign qiuyesuifeng for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added contribution This PR is from a community contributor. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Feb 5, 2026
ti-chi-bot bot commented Feb 5, 2026

Hi @ystaticy. Thanks for your PR.

I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

@ti-chi-bot ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 5, 2026
coderabbitai bot commented Feb 5, 2026

📝 Walkthrough

Walkthrough

Adds a per-keyspace-group gauge metric and exposes a setter; refactors bootstrap keyspace ID resolution into internal helpers to avoid import cycles; updates keyspace-group save to update metrics post-commit; adds tests and a go.mod dependency change.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Dependency Management | go.mod | Changed github.com/prometheus/client_model from an indirect to a direct requirement (removed // indirect). |
| Metrics Implementation | pkg/tso/metrics.go | Added a new gauge metric for keyspace-group keyspace counts, an exported setter SetKeyspaceGroupKeyspaceCountGauge, internal helpers to set/delete per-group values, and registration of the metric. |
| Metric Integration in Persistence | pkg/keyspace/tso_keyspace_group.go | After keyspace-group saves are committed, calls the new tso.SetKeyspaceGroupKeyspaceCountGauge for each group to update the gauges; captures and returns the transaction error before any post-commit updates run. |
| Bootstrap ID Refactor | pkg/tso/keyspace_group_manager.go, pkg/utils/tsoutil/tso_request.go | Introduced an internal getBootstrapKeyspaceID based on kerneltype.IsNextGen() and replaced direct keyspace.GetBootstrapKeyspaceID() calls to avoid import cycles; added a nil check for kgm.metrics in the deletion path. |
| Tests | pkg/tso/keyspace_group_manager_test.go | Added TestKeyspaceListLengthMetric to verify that the gauge is set and removed via the new metric APIs. |
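
The set/delete behavior that the Metrics Implementation and Tests rows describe can be sketched without any third-party dependency. The block below is only a stdlib stand-in for a labeled gauge vector, not the code in pkg/tso/metrics.go: the type groupGauge and its methods set, remove, and get are hypothetical names introduced for illustration, and the real change is described as using a Prometheus gauge instead.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// groupGauge is a hypothetical, stdlib-only stand-in for a labeled gauge
// vector: it keeps one value per keyspace-group label.
type groupGauge struct {
	mu     sync.Mutex
	values map[string]float64
}

func newGroupGauge() *groupGauge {
	return &groupGauge{values: make(map[string]float64)}
}

// label turns a group ID into the string used as the label value.
func label(groupID uint32) string {
	return strconv.FormatUint(uint64(groupID), 10)
}

// set records how many keyspaces are currently assigned to a group.
func (g *groupGauge) set(groupID uint32, keyspaceCount int) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.values[label(groupID)] = float64(keyspaceCount)
}

// remove drops the series of a deleted group so that a stale count is not
// reported indefinitely. It reports whether a series existed.
func (g *groupGauge) remove(groupID uint32) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	key := label(groupID)
	_, ok := g.values[key]
	delete(g.values, key)
	return ok
}

// get returns the current value and whether the series exists.
func (g *groupGauge) get(groupID uint32) (float64, bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	v, ok := g.values[label(groupID)]
	return v, ok
}

func main() {
	g := newGroupGauge()
	g.set(1, 3) // group 1 holds three keyspaces
	v, ok := g.get(1)
	fmt.Println(v, ok)

	g.remove(1) // group 1 was deleted
	_, ok = g.get(1)
	fmt.Println(ok)
}
```

Deleting the series rather than setting it to zero is the design point here: a zero would still look like a live group on a dashboard, whereas a removed series disappears.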

Sequence Diagram(s)

sequenceDiagram
  participant API as API/Caller
  participant KVS as KVStore/Txn
  participant DB as Storage
  participant TSO as TSO metrics

  API->>KVS: saveKeyspaceGroups(groups)
  KVS->>DB: RunInTxn(persist groups)
  DB-->>KVS: commit success
  KVS-->>API: return nil
  Note over API,KVS: Post-commit side-effects
  loop for each group
    API->>TSO: SetKeyspaceGroupKeyspaceCountGauge(groupID, count)
    TSO-->>TSO: update gauge vector[label=groupID]
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰
I hopped through imports, nudged a gauge so bright,
Saved groups in order, then set metrics right.
Bootstrap found its home without a cycle fight,
Gauges now hum softly through day and night.

🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Out of Scope Changes check | ⚠️ Warning | Changes to the go.mod dependency and the bootstrap keyspace ID refactoring appear unrelated to the primary objective of adding TSO keyspace group metrics. | Clarify or separate the go.mod dependency change and the bootstrap keyspace ID refactoring, as they may be out of scope for the metrics-only issue. |
| Title check | ❓ Inconclusive | The title 'Tso keyspace group keyspace list length' is partially related to the changeset but lacks specificity about what was changed. | Consider a more descriptive title following the template format, such as 'tso, keyspace: add TSO keyspace group keyspace list length metrics', to clarify the main change. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The PR description follows the template format but includes an incomplete commit-message block and minimal details explaining the implementation. |
| Linked Issues check | ✅ Passed | The PR implements metrics for TSO keyspace groups, including SetKeyspaceGroupKeyspaceCountGauge and metric updates, aligning with issue #10219's objective to add TSO keyspace group metrics. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

Signed-off-by: ystaticy <y_static_y@sina.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/tso/keyspace_group_manager.go (1)

915-956: ⚠️ Potential issue | 🟠 Major

The NextGen reserved keyspace can be removed or inserted incorrectly.
checkReserveKeyspace assumes the reserved keyspace sits at index 0: it removes it with newKeyspaces[1:] and adds it back by prepending it. When getBootstrapKeyspaceID returns SystemKeyspaceID, that assumption may not hold, so the slice expression can drop the wrong keyspace and the prepend can leave the list unsorted. Remove the keyspace by value and keep the list ordered.

🐛 Suggested fix to keep reserved key handling order-safe
func (kgm *KeyspaceGroupManager) checkReserveKeyspace(newGroup *endpoint.KeyspaceGroup, newKeyspaces []uint32, reserveKeyspace uint32) {
	if newGroup.ID == constant.DefaultKeyspaceGroupID {
		if _, ok := newGroup.KeyspaceLookupTable[reserveKeyspace]; !ok {
			log.Warn("this keyspace is not in default keyspace group. add it back", zap.Uint32("keyspace", reserveKeyspace))
			kgm.keyspaceLookupTable[reserveKeyspace] = newGroup.ID
			newGroup.KeyspaceLookupTable[reserveKeyspace] = struct{}{}
-			newGroup.Keyspaces = make([]uint32, 1+len(newKeyspaces))
-			newGroup.Keyspaces[0] = reserveKeyspace
-			copy(newGroup.Keyspaces[1:], newKeyspaces)
+			newGroup.Keyspaces = append(append([]uint32{}, newKeyspaces...), reserveKeyspace)
+			sort.Slice(newGroup.Keyspaces, func(i, j int) bool {
+				return newGroup.Keyspaces[i] < newGroup.Keyspaces[j]
+			})
		}
	} else {
		if _, ok := newGroup.KeyspaceLookupTable[reserveKeyspace]; ok {
			log.Warn("this keyspace is in non-default keyspace group. remove it", zap.Uint32("keyspace", reserveKeyspace))
			kgm.keyspaceLookupTable[reserveKeyspace] = constant.DefaultKeyspaceGroupID
			delete(newGroup.KeyspaceLookupTable, reserveKeyspace)
-			newGroup.Keyspaces = newKeyspaces[1:]
+			filtered := make([]uint32, 0, len(newKeyspaces)-1)
+			for _, ks := range newKeyspaces {
+				if ks != reserveKeyspace {
+					filtered = append(filtered, ks)
+				}
+			}
+			newGroup.Keyspaces = filtered
		}
	}
}
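
The same idea as the suggested diff, remove by value and insert in sorted position, can be tried in isolation. This is a minimal sketch that assumes Go 1.21 or later for the standard slices package; insertSorted and removeByValue are hypothetical helper names, and the reserved ID 100 is an arbitrary illustration value, not the real SystemKeyspaceID.

```go
package main

import (
	"fmt"
	"slices"
)

// insertSorted returns a new slice that contains id exactly once and stays in
// ascending order. It assumes keyspaces is already sorted ascending.
func insertSorted(keyspaces []uint32, id uint32) []uint32 {
	idx, found := slices.BinarySearch(keyspaces, id)
	if found {
		return slices.Clone(keyspaces)
	}
	return slices.Insert(slices.Clone(keyspaces), idx, id)
}

// removeByValue returns a new slice without id, preserving the order of the
// remaining elements. It makes no assumption about where id is located.
func removeByValue(keyspaces []uint32, id uint32) []uint32 {
	out := make([]uint32, 0, len(keyspaces))
	for _, ks := range keyspaces {
		if ks != id {
			out = append(out, ks)
		}
	}
	return out
}

func main() {
	const reserved uint32 = 100 // hypothetical reserved keyspace ID

	// The reserved ID is in the middle of the list, so slicing off index 0
	// would have removed keyspace 1 instead.
	fmt.Println(removeByValue([]uint32{1, 5, 100, 200}, reserved))

	// Prepending would have produced an unsorted list here.
	fmt.Println(insertSorted([]uint32{1, 5, 200}, reserved))
}
```

Both helpers return fresh slices, so the caller's newKeyspaces is never modified in place. Compared with the append-then-sort form in the diff, the binary-search insert avoids re-sorting, at the cost of requiring sorted input.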
pkg/keyspace/tso_keyspace_group.go (1)

361-372: ⚠️ Potential issue | 🟡 Minor

Defer gauge metric updates until after the transaction commits.

Updating SetKeyspaceGroupKeyspaceCountGauge inside the transaction callback can leave metrics inconsistent with persisted data. If a later iteration's LoadKeyspaceGroup or SaveKeyspaceGroup fails, the transaction rolls back but earlier gauge updates persist, reflecting a state that was never written to storage.

Collect the gauge updates and apply them after RunInTxn succeeds.
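
One way to collect the updates and apply them only after a successful commit is sketched below. Everything here is an assumption made for illustration: runInTxn is a trivial stand-in for the real transactional helper, and group, saveGroups, persist, setGauge, and errPersist are hypothetical names rather than the identifiers used in pkg/keyspace/tso_keyspace_group.go.

```go
package main

import (
	"errors"
	"fmt"
)

// errPersist is a sentinel error used only by this example.
var errPersist = errors.New("persist failed")

// group is a hypothetical, trimmed-down keyspace group.
type group struct {
	ID        uint32
	Keyspaces []uint32
}

// runInTxn is a stand-in for a transactional helper: a non-nil error from the
// callback means that nothing was persisted.
func runInTxn(fn func() error) error { return fn() }

// saveGroups persists all groups in one transaction and calls setGauge only
// after the whole transaction has succeeded.
func saveGroups(groups []group, persist func(group) error, setGauge func(id uint32, count int)) error {
	type update struct {
		id    uint32
		count int
	}
	pending := make([]update, 0, len(groups))

	err := runInTxn(func() error {
		for _, g := range groups {
			if err := persist(g); err != nil {
				return err
			}
			// Only remember the update; do not touch metrics yet.
			pending = append(pending, update{id: g.ID, count: len(g.Keyspaces)})
		}
		return nil
	})
	if err != nil {
		return err // rolled back: metrics stay untouched
	}

	for _, u := range pending {
		setGauge(u.id, u.count)
	}
	return nil
}

func main() {
	gauges := map[uint32]int{}
	setGauge := func(id uint32, count int) { gauges[id] = count }
	groups := []group{
		{ID: 1, Keyspaces: []uint32{10, 11}},
		{ID: 2, Keyspaces: []uint32{20}},
	}

	// The second group fails, so no gauge may be set, not even for group 1.
	failOnSecond := func(g group) error {
		if g.ID == 2 {
			return errPersist
		}
		return nil
	}
	fmt.Println(saveGroups(groups, failOnSecond, setGauge), len(gauges))

	// When every group persists, both gauges are set after the commit.
	persistAll := func(group) error { return nil }
	fmt.Println(saveGroups(groups, persistAll, setGauge), gauges)
}
```

Buffering the updates in a local slice keeps the transaction callback free of side effects, which also makes it safe if the helper ever retries the callback.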

@bufferflies bufferflies added ok-to-test Indicates a PR is ready to be tested. and removed needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Feb 5, 2026
Signed-off-by: ystaticy <y_static_y@sina.com>
ti-chi-bot bot commented Feb 5, 2026

@ystaticy: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-unit-test-next-gen-2 | 37cf092 | link | true | /test pull-unit-test-next-gen-2 |

codecov bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 85.18519% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.66%. Comparing base (2139230) to head (37cf092).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10218      +/-   ##
==========================================
+ Coverage   78.63%   78.66%   +0.03%     
==========================================
  Files         520      520              
  Lines       70089    70112      +23     
==========================================
+ Hits        55112    55152      +40     
+ Misses      10989    10986       -3     
+ Partials     3988     3974      -14     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 78.66% <85.18%> (+0.03%) ⬆️ |

Labels

contribution This PR is from a community contributor. dco-signoff: yes Indicates the PR's author has signed the dco. ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add TSO keyspace group metrics collect len(group.keyspaces)

2 participants